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Abstract 

Cytoplasm contains important metabolism reaction organelles such as mitochondria and chloroplast (in plant). In particular, 
mitochondria contains special DNA information which can be passed to offsprings through maternal gametes, and has been 
confirmed to play a pivotal role in nuclear activities. Experimental evidences have documented the importance of cyto- 
nuclear interactions in affecting important biological traits. While studies have also pointed out the role of interaction 
between imprinting nuclear DNA and cytoplasm, no statistical method has been developed to efficiently model such effect 
and further quantify its effect size. In this work, we developed an efficient statistical model for genome-wide estimating and 
testing the cytoplasmic effect, nuclear DNA imprinting effect as well as the interaction between them under reciprocal 
backcross and F2 designs derived from inbred lines. Parameters are estimated under maximum likelihood framework 
implemented with the EM algorithm. Extensive simulations show good performance in a variety of scenarios. The utility of 
the method is demonstrated by analyzing a published data set in an F2 family derived from C3H/HeJBir and C57BL/6 J 
mouse strains. Important cyto-nuclear interactions were identified. Our approach provides a quantitative framework for 
identifying and estimating cyto-nuclear interactions subject to genomic imprinting involved in the genetic control of 
complex traits. 
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Introduction 

One of the central foci in biological study is to unravel the 
genetic secrets of complex traits of agricultural, evolutional and 
biomedical importance. Quantitative trait locus (QTL) mapping 
has been the major tool for this purpose over decades [1-3]. In 
QTL mapping, the identified QTLs are chromosome segments 
harboring potential genetic variants that could give rise to 
phenotypical manifestation. Large successes have been witnessed 
in the past [4]. However, there are still many phenomena that 
could not be explained by Mendelian genetics, leading to the new 
exploration of research focus on epigenetics [5] . 

Genomic imprinting, one of the major epigenetic phenomena, 
plays key roles in controlling embryonic growth and development 
[6,7]. Let subscript letter M and F denote the parental origin of 
two alleles in a diploid organism, then a locus with two alleles A 
and a is thought to be imprinted if two heterozygotes AmCIf and 
OmA-f have different expressions [8]. The malfunction of 
imprinted genes could potentially lead to abnormal characters 
such as cancers or other genetic disorders [9] . 

Genomic imprinting effect is considered as one type of parent- 
of-origin effect due to allelic effect with specific parental origin. In 
contrast to this, maternal effect or cytoplasmic effect is also 
considered as one type of parent-of-origin effect in which the 
offspring's expression is influenced by maternal parent. For 
example, a mother's genotype, even if not transmitted to her 
offspring, may influence in utero conditions and increase risk and/ 



or interact with genetic predisposition for particular diseases 
among those offspring [10]. For cytoplasm, it contains a wide 
variety of organelles such as mitochondria and chloroplast (in 
plant). Almost all the reactions of cellular metabolism take place in 
such an environment. It has been demonstrated that cytoplasm 
plays a central role in coordinating the activities of nuclear genetic 
materials [11-15]. Thus, the identification of cyto-nuclear 
interaction could shed novel insights into the genetic and 
epigenetic control of phenotypic variation. A number of empirical 
studies have documented the significant contribution of cyto- 
nuclear interaction to phenotypic variation in organisms such as 
wheat, rice, mice, yeast and Drosophila [16-19]. 

On the other hand, the existence of such parent-of-origin effects 
may lead to incorrect interpretation of the (marginal) effects of 
particular genes when performing genetic mapping studies, unless 
such effects are appropriately accounted for in the analysis [20]. 
Statistical methods for dissecting genomic imprinting effect has 
been extensively studied in literature (e.g., [21-26]). Tang et al. 
[27] developed a model to evaluate cyto-nuclear interaction effect 
based on experimental crosses. However, how the two types of 
parent-of-origin effects, one in nuclear level and one in cytoplas- 
mic level, interact to influence phenotypic variation is largely 
unknown due to the lack of proper statistical models. 

In this work, we discuss potential scenarios of parent-of-origin 
effects, and present a statistical method to dissect the cyto-nuclear 
interaction effects subject to genomic imprinting. The developed 
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framework is based on experimental crosses which can be realized 
through two different designs, reciprocal backcross and F2 design. 
When an F2 design is considered, sex-specific difference in 
recombination fractions, which is initially discovered in [28,29] 
and later observed in many species, is incorporated into our model 
to distinguish the genetic differences between two reciprocal 
heterozygote. Such information can be found in literature, such as 
the female-to-male recombination ratio of 1.6:1 for human [30], 
1.4:1 for pig [31], 1.4:1 for dog [32], 1.25:1 for mouse [33] on 
average across the whole genome. A genome-wide scan for the 
identification of iQTL mapping cyto-nuclear epistasis is performed 
based on the interval mapping theory, and parameters are 
estimated based on the framework of maximum likelihood method 
implemented with the EM algorithm [34]. Extensive simulations 
are conducted to evaluate the performance of our model under 
different scenarios, such as different sample sizes, different 
heritabUity levels, and different gene effects. The utility of the 
method is illustrated by applying it to a genome-wide scan of four 
traits in an F2 family derived from two inbred mouse strains. 

Statistical Methods 

Genetic designs 

Consider a design initiated with two inbreed lines with two 
segregating alleles A and a. Let subscript letter M and F represent 
the parental origin of offspring alleles inherited from the mother 
and father, respectively. A complete dissection of the cyto-nuclear 
interaction subject to imprinting needs experimental designs that 
can distinguish the quantitative variation between two heterozy- 
gotes Af^ap and a^Af, and also against the cytoplasmic effect. 
For this purpose, a reciprocal backcross or Fj design is proposed so 
that variations and interactions can be fuUy introduced. 

Figure 1 illustrates the reciprocal backcross design. Let the 
maternal parent carrying genotype AA (denoted as P2) has a 
cytoplasmic effect in contrast to the maternal line carrying 
genotype aa (denoted as PI) as the reference line. In the diagram, 
individuals that carry the maternal effect coming from the 
maternal parent with genotype AA are denoted by gray squares. 
Four possible backcrosses can be initiated as illustrated in Fig. 1. 
As shown in the figure, any backcross offsprings coming from the 
middle two designs carry the cytoplasmic maternal effect derived 
from the A A genotype. 

Figure 2 shows the F2 design. Denote the left one as design 5i , 
and the right one as design 52- The cross of two Fj's generates four 
possible allele-specific F2 genotypes. Assuming there is a cytoplas- 
mic effect, F2 offsprings may show different phenotypes depending 
on the genotype of the maternal parental lines. For example, if 
maternal cytoplasmic effect exists, the offspring phenotypic value 
for AA genotype may be different depending on whether it comes 
from the 5i or ^2 design. In the F2 design, the two reciprocal 
heterozygotes AmCIf and OmAf cannot be distinguished in 



general. Sex-specific recombination difference in male and female 
needs to be considered in order to distinguish the two (Cui et al. 
2006) [25]. 

Statistical parameterization 

For a particular cross, let yj (/' = 1 , • ■ ■ ,n) denote the phenotypic 
value of interest. Following Tang et al. [27], the one-QTL genetic 
model can be expressed as, 

yj = /i + cxy + ax2j + dxy + ix4j + i„,xsj + icdX(,i + UiX-jj + ej ( 1 ) 

where fJ. is the overall mean; c is the cytoplasmic effect; a, d and / 
are the additive, dominance and imprinting effects of a QTL, 
respectively; and ica, hd and id are the cytoplasm by additive, 
cytoplasm by dominance and cytoplasm by imprinting interac- 
tions, respectively; Xy is an indicator variable, denoting Xy = 1 for 
the AA maternal cytoplasm and Xy = — 1 for the aa maternal 
cytoplasm; X2j,XT,j, ■ ■ ■ ,Xy are other indicators of various effects 
describing the additive, dominance and the interaction effect 
between the cytoplasm and genetic variables. 

For the Si design in the F2 population initiated with cross 
AA X aa, the mean genotypic values of four possible genotypes 
formed by different allelic combination from the two Fi parents 
can be expressed as. 



' fii=fi + c + a + !,.a, tor AmAp, 

H2 = H + c + d + i+icd + ici, ioT Auap, 

l,i^=H + c + d — i+icd — ici, for auAp, 

lin = ji + c — a — ica, forflMflF- 



(2) 



Similarly for the 1S2 design initiated with cross aa x AA, the 
genetic model can be expressed as. 



^ii=j-i — c + a — icc 
H^ = H — c + d + i- 
fi-j = H — C + d — i- 
. Ms = M-c-a + in 



, for AmAp, 

' ^cd ^c/i 

for AMap, 
-icd + ici, ioT auAp, 
, for amap. 



(3) 



For the reciprocal backcross design which consists oiBi, B2, S3 
and B4, the indicator variables in Eq. (1) describing different QTL 
genotypes are defined in Table 1. 

For simplicity, we will use matrix form to rewrite the models. 
Let us denote 



/} = (ti,c,a,d,i,ica,icd,icd , fi = {H\,ii24h4k4'-s,^b,ti-i,^i) 
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Figure 1. A reciprocal baclccross design. 
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Figure 2. A reciprocal F2 design. 

doi:10.1371/journal.pone.0091702.g002 
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eight parameters can be written as 
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(4) 



(5) 



(6) 



for the F2 design. For the purpose of illustration, the following 



estimation and inference is demonstrated through the F2 design. 
The same procedure applies to BC design too. 

The mixture model and the likelihood 

Statistical methods for QTL interval mapping based on a 
mixture model traced back to the work by Lander and Botstein 
[1]. In the mixture model, each observation y is modeled as a 
weighted mixture of / (known and finite) components, and each 
component, which corresponds to a certain genotype category 
depending on the underlying genetic design, follows a certain 
distribution fj with weight nj. Conditional on the marker genotype 
M and unknown parameters (j) and the density of the observed y 
has the following expression 

y~p(y\M,(l),,i) = n^fi(y\M,(l>i,ri) + ... + njfj(y\M,<l>j,,il (7) 



where n - 



refers to the mixture proportions which are 
J 1. J._/J, J. \T 



= (7Ii,...,7Iy) 

„ ' ' 7t/ = i; <p=Wi,-,'Pj)' IS a 

vector for the component-specific parameters, with being 
specific to yth component; and rj consists of parameters (i.e., 
residual variance) that are common to all components. 

For the F2 design we described above, there are four genotypes 
at each locus [AmAf, Amcif, omAf, and umcif)- The genotype of 
the QTL is generally unobservable, but can be inferred by using 
the two flanking markers' information. Given the flanking marker 
genotypes of the ith individual, the conditional probabilities 
'ti = i'''^AMAr\h^AMaF\i-'^a„AF\i,''^aMar\i) of the QTL genotype can be 
calculated. These conditional probabilities become the mixture 
proportions in the mixture model (7). Let us denote 
Jti = {Ti\\i,n2\i,nni,n4ii) to simplify the notation. These conditional 
probabilities are expressed in terms of sex-specific recombination 
rates in order to distinguish the two reciprocal heterozygotes. 
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Please refer to Cui et al. [25] for the conditional probabilities of 
QTL genotypes given marker genotypes in terms of sex-specific 
recombination fractions for an design. 

Assume the total number of F2 offsprings for design 5'i and ^2 
are ni and «2 respectively, and let n = n\ +M2- Phenotype data for 
a certain quantitative trait can be observed and recorded as a 
vector y = {y\,...,yn), where y*'* = (yiv>J'Hi) are from design 5'i 
and y'^* = (>'«, + i, •••:>'«) are from design 52. Marker information 
can be reorganized as matrix M =(Mi ; M2)^, where the y'tli 
{] = \,...,n\) row of Mi (the y'tli row of matrix M) contains all the 
marker information of the y'th individual under design S\ and the 
A:th (k=\,...,n2) row of M2 (the («i+^)th row of matrix M) 
contains all the marker information of the kth individual under 
design 52. Based on the mixture model (7), and with the 
independence assumption, the joint likehhood function for the 
F2 family with total n individuals, constructed by combining design 
S\ and 52 together, can be formulated as, 

L(0|M, y)= ri [ni |/i0',) + K2|/2(F,) + 7r3|/3(>',) + 't4|,/40'.)l 

(8) 

X n [7Il[(/5(>'/) + 'r2|/6(j'/) + 7r3|/70'i) + 'l4|i/8(>'/)l, 
/ = nj +1 

where the unknown vector 0 contains the QTL position, QTL 
effects and residual variance, and the density function fj (/' = 1,...,8) 
is assumed to foUow a normal distribution with mean fi^- and 
common variance . More specifically, parameter vector 0 can be 
divided into two subsets, 0/ and Qg, where 0/ describes the location 
of QTL and Qg contains all the genetic parameters, including 
QTL-efFects vector fi = {^,c,a,d,i,ii.aJcd4ci)^ and residual variance 
(T^ in our model. 

Parameter Estimation 

To estimate the unknown parameters 0=(0/,0j,), several 
algorithms could be implemented, such as Expectation-Maximi- 
zation (EM), Newton Raphson and Fisher Scoring. Among all 
these methods, EM algorithm, which was initially developed by 
Dempster et al. [34], is most commonly used in QTL mapping 
study. In this paper, EM algorithm is apphed to obtain the 
maximum likelihood estimates (MLEs). This procedure involves 
differentiating the log-likelihood function with respect to each 
unknown parameter, letting the derivatives equal to zero, and 
solving the log-likelihood equation for the corresponding param- 
eter. Please read Appendix SI for the detailed derivations of 
parameters estimation and the algorithm. 

For the QTL position which is unknown in the model, we did 
not estimate parameters 0/ directly. As commonly treated in QTL 
mapping studies, we applied a grid search approach to estimate 
the putative QTL position via scanning the entire linkage genome 
by 1 or 2 cM increment flanked by two markers and did a 
hypothesis testing at each putative position. A likelihood ratio or 
LOD profile plot can be generated to graphically display the LR 
or LOD test statistic for a putative QTL at each testing position. 
The genomic position which corresponds to a peak in the profile 
plot is the MLE of the QTL location, given that the peak passes 
the genome-wide threshold determined by the permutation tests 
detailed below. Bootstrap methods can be applied to assess the 
confidence interval of the estimated position [35]. 

Hypothesis Test 

Testing the overall QTL effect on the quantitative trait is the 
frrst step toward a complete dissection of difiTerent genetic 



contributions to the trait. Once the MLEs of the parameters are 
obtained, the presence of QTL responsible for the variation of the 
quantitative phenotype can be tested by using the following 
hypotheses, 

{Mq '. ci = d = i = = = i^i = 0 
Hi : not all equal zero 

The test statistic for testing the above hypotheses is calculated as 
the log-likelihood ratio test statistic (LR) of the full model {Hi) over 
the reduced model {Ho): 

LR=-2log[m)-m)] (9) 

where q and fj denote the MLEs of the unknown parameters 
under Hq and Hi , respectively. Since a genome-wide scan involves 
multiple correlated tests, we use the permutation tests proposed by 
Churchill and Doerge [36] to find the threshold value. 

If there is a QTL, a number of other hypothesis tests can also be 
performed to test the property of the detected QTL. To test the 
imprinting effect, we can simply formulate the hypothesis as 
Hq : i = 0 vs Hi : ij^O to assess the mean difference of the two 
reciprocal heterozygotes. To test the cytoplasmic maternal effect, 
the hypothesis can be stated as Hq : c = Q vs Hi : c#0. The 
epistatic effects of all interaction terms can also be tested as 

{Ho tea = icd = hi = 0 

Hi : not all equal zero 

Similarly, additive and dominance effects can be tested as 

(Ho: a = d = 0 

\ Hi : not all equal zero 

If specific interest is focused, for instance, the interaction of 
imprinting and maternal effect, the hypothesis can be formulated 
as Ho : 4i" = 0 vs Hi : ici¥=0- All the above tests can be done by 
applying the likelihood ratio test in which the test statistic 
asymptotically follows a distribution with degrees of freedom 
equal to the difference of the parameters under the nuU and the 
alternative hypotheses. For example, when testing Hq : a = d = 0, 
the LR test statistics is compared with the X2 cutoff with 2 degrees 
of freedom. 

Results 

Monte Carlo simulation 

Monte Carlo simulations were performed to investigate the 
statistical behavior of our model. W e simulated an F2 population, 
with one half of the population coming from design 5i and the 
other half coming from design 52 . A genome with 100 cM long 
linkage group, composed of 6 equidistant markers, was construct- 
ed. The position of QTL was assumed to be located at 48 cM 
away from the first marker on the linkage group. The marker 
genotypes in the F2 population were simulated by mimicking sex- 
specific recombination fractions in mice, i.e., = i-25rp [33]. A 
series of simulation study with different sample size (« = 400 vs 
« = 800) and different heritability levels (//^ =0.1,0.25,0.4) was 
conducted to examine the impacts of parameter spaces on 
parameter estimation and testing power. These simulation designs, 
which were aimed to give a better understanding of model 
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performance under different situations, can provide biologists 
some empirical evidences to design tlieir experiments. 

In the simulation study, the residual variance was calculated 
under different heritability levels. For the F2 design, the genetic 
variances of various terms can be calculated as follows: f7^ = a^/2, 

i^Jl + cicd, and the broad sense heritability level can be expressed 
as 

H^ = cjII{cjI + cjI + cjI). 

For given genetic parameters and the heritability level, the residual 
variance can be calculated as al = {(7^ + alQ){l / — l) — al, 
from which the phenotype data can be generated. 

The MLEs of the QTL position and effect parameters, based on 
200 simulation replicates under different heritability levels and 
sample sizes, are displayed in Table 2. The square root of mean 
squared error (RMSE) of parameter estimates are given in 
parenthesis to show the estimation accuracy. As we expected, 
the accuracy of parameter estimates increases as the sample size 
and heritability level increase. For instance, the RMSE of 
estimated QTL position decreases from 13.28 to 3.81, an 71% 
increase in accuracy when the sample size increases from 400 to 
800 under a fixed heritability level 0.1. The other parameter 
estimates show the same pattern. If we increase the heritability 
level when the sample size is fixed, a clear reduction in RMSE can 
be observed. For example, with 400 samples, the RMSE of 
estimated QTL position decreases from 13.28 to 4.99, then to 2.77 
as gradually increases from 0.1 to 0.4. From the decreasing 
RMSEs of the parameter estimates, we observed that simply 
increasing sample size is less efficient than increasing heritability 
level in order to increase the precision of parameter estimation. 
Since high heritability corresponds to small environmental 
variability [37], reducing environmental variation is of more 
practically important than just simply increasing sample size. 

Note that we did not list the estimation of the imprinting effect i 
and cyto-imprinting interaction id, which are not estimable under 
the F2 design. The reason is that the imprinting direction cannot 



be inferred from the F2 design [2.5]. However, we can still conduct 
hypothesis test to infer the imprinting effect as well as its 
interaction with cytoplasmic effect. To further investigate the 
testing performance of cytoplasmic and imprinting effects, we 
introduced two proportions, namely rj^ and rjj, where >l^ = al/aQ 
and r\i = a^Ja^Q. We can evaluate the test power under different 
cytoplasmic and imprinting effect sizes. Given all other genetic 
parameter values fixed (as shown in Table 2), simple algebra shows 
that the cytoplasmic effect c and imprinting effect i can be 
calculated for a given value of 17^ or )/,, i.e., 

c = , and i = I 

where Qi = a], + ff^ + ^2 + 4/2 + f-Jl + ilJl, and 

Qi = <^ + 4/2 + 4/2 + 4/2 + cicd. 

Based on 1000 simulation runs under different heritabilities, 
sample sizes and variation proportions, the power of cytoplasmic 
effect test, imprinting effect test, interaction effects test and 
additive/dominance effects test are listed in Table 3. As we expect, 
the test power increases with the increasing of sample size and 
heritability level. For example, the cytoplasmic testing power 
increases from 0.663 to 0.905 as sample size increases from 400 to 
800, a 36.5% increase in power for frxed 17 ^, = 0.1 and H^ = 0.25. 
The same pattern is observed for the imprinting test. As the 
proportion of variance explained by the cytoplasmic and 
imprinting effect increases, the power increases accordingly. 
Noted that when both t]^ and (7, are zeros, the testing power 
corresponds to the type I error rate for the corresponding factor. 
From the table we can see that the size of imprinting test is well 
controlled under different sample sizes and heritability levels. For 
the cytoplasmic effect, the size is a litde inflated under n = 400, but 
it gets close to the nominal 5% level as sample size increases to 
800, especially under large heritability (e.g., = 0.4). In sum, the 
simulation evidences show that the model performs reasonably 
well in both parameter estimation and testing. 



Table 2. The MLEs of the QTL position and effect parameters based on 200 simulation replicates under different heritabilities and 
sample sizes. 
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(0.04) 



The squared roots of the mean squared errors (RMSE) of the MLEs are given in parentheses. 

The locations of the QTL is described by the map distances (in cM) from the first marl<er of the linl<age group. The hypothesized value is 3.81 for =0.1, 2.04 for 
i/- = 0.25 and 1.26 for H^ = OA. 
doi:l 0.1 371/journal.pone.0091 702.t002 
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Table 3. The power of four hypothesis tests based on 1000 samplings under different heritabilities, sample sizes and variation 
proportions. 



n 




Power^ 






Power^ 






Power-^ 


Power'^ 






(/,, = 0% 


f;,,= 10% 


f(, = 20% 


/),. = 0% 


v,- = io% 


i;,- = 20% 






400 


0.10 


0.088 


0.353 


0.528 


0.066 


0.085 


0.089 


0.654 


0.678 




0.25 


0.081 


0.663 


0.928 


0.051 


0.263 


0.432 


0.990 


0.992 




0.40 


0.064 


0.930 


0.998 


0.050 


0.807 


0.942 


1.000 


1.000 


800 


0.10 


0.073 


0.464 


0.753 


0.059 


0.100 


0.126 


0.898 


0.927 




0.25 


0.066 


0.905 


0.998 


0.047 


0.473 


0.713 


1.000 


1.000 




0.40 


0.049 


0.997 


1.000 


0.050 


0.980 


0.997 


1.000 


1.000 



Power* {k = 1 ,2,3,4) refer to the powers for testing 1 ) i/o : f = 0 vs i/i : c # 0; 2) //q : /' = 0 vs //] : z # 0; 3) : /„ = icd = id = Ovs H, : not all equal lo 0; and 4) 
Hq : a = d = Ovs Hi : not all equal to 0, respectively. For a given r]^, all other effect values are fixed as 0.8 except for c, which can be calculated in terms of z;^. and other 
parameters. The hypothesized c value is 0 for z/^. = 0, 0.461 for z;^, = 0.1 and 0.679 for Z7^, = 0.2. Similarly, the value of z, which depends on imprinting effect variation 
proportion z),, is 0 for z(, =0, 0.680 for z(, = 0.1 and 1.020 for z/, =0.2. 
doi:l 0.1 371 /journal.pone.0091 702.t003 



A case study 

We applied the model to a published F2 cross data set based on 
design Si and design 52 aimed to find QTLs that contribute to 
variation in quantitative traits related to colitis severity in lL-10- 
deficient mice [38]. The data contain 411 mice derived from 
inbred strains, where 203 mice are from design 5i and 208 mice 
are from design ^2. Ninety-one markers were obtained with an 
average length of ~ 1 5 cM spanning acroos the 1 9 autosome 
chromosomes. For more information about the data, the readers 
are referred to the original paper [38] . 

It has been reported that on average the female chromosome is 
25% longer in genetic distance between homologous loci than the 
male in mice [38]. The sex-specific recombination fraction, 
expressed as = l.25rp, was reconstructed based on the marker 
information (see Cui et al. for details [25]). The method was 
applied to four phenotypes, cecum total score (CTS), spleen/body 
weight ratio (SBWR), mesenteric lymph node(MLN)/body weight 
ratio(MBWR), and secretory IgA (SIgA) level, where cecum total 
score was graded by using colitis-related criteria, including 
severity, hyperplasia, ulceration and the percentage of area 
involved. Box-Cox transformation was applied to all traits before 
fitting the Gaussian-mixture model. The genome-wide LOD 
profile plots for the four phenotypes are shown in Figure 3, where 
the solid blue curves correspond to the LOD values and the 
dashed red lines correspond to the 5% genome-wide threshold 
values out of 1000 permutations. The LOD score is calculated as 
logigLR, where LR is obtained from equation (9) to test the null 
hypothesis: Hq : a = d = i = ica = icd = ici = ^- 

As shown in Figure 3, one QTL on chromosome 3 is detected 
for cecum total score trait, two QTLs on chromosomes 3 and one 
on chromosome 1 are detected for spleen/body weight ratio trait, 
three QTLs on chromosome 3 are detected for MLN/body weight 
ratio trait, and two QTLs located on chromosome 3 are detected 
for SIgA trait. The QTL located at 60.6 cM on chromosome 3 is 
common to three traits. The one located at 52.6 cM for SIgA trait 
is very close to it. It is highly possible that it is the same QTL that 
controls the four traits. Such a pleiotropic effect needs to be further 
evaluated. It should be mentioned that the QTL detected in the 
original paper for the four traits is located at 61.8 cM on 
chromosome 3 [38], which is 1.2 cM away from the one we found. 
Such a difference may arise because of the capitalization of sex- 
specific recombination rates and different models fitted. 

In addition to the QTL identified in our analysis and the 
original paper, some other major QTLs which are not detected in 



Farmer et al. [14], such as those at 52.6 cM and 38.6 cM on 
chromosome 3, stand out in our model and therefore need further 
investigation. Almost aU the QTLs on chromosome 3 are clustered 
together, whose local LOD profiles of four traits are shown in 
Figure 4. 

The marker interval for each QTL is listed in Table 4, which 
also tabulates the p-values under four different tests for each 
estimated QTL using permutation tests. From the test results, it 
can be seen that most QTLs have strong additive and dominance 
genetic effect, except for the spleen/body weight ratio QTL 
located at DlMit 156+3 1.1 cM on chromosome 1. This QTL 
shows evidence of cytoplasmic, imprinting as well as cyto-nuclear 
interaction effects (p-values for the three tests are 0.0282, 0.0296 
and 0.0073, respectively), but shows no sign of additive and 
dominance effect. In addition, the MLN/body weight ratio QTL 
located at D3Mit78+5.6 cM shows evidence of cytoplasmic effect. 
In summary, we identified one QTL on chromosome 1 with 
evidence of cyto-nuclear interaction effect and this QTL also 
shows evidence of cytoplasmic and imprinting effect. Further 
functional validation is needed to confirm the results. 

Discussion 

The cytoplasmic environment influences the expression of 
nuclear information in a very complicated way, which is stiU an 
unravel mystery to many organisms. For example, Burgess and 
Husband have demonstrated great maternal contributions to the 
fimess of mulberry hybrids [39]. While it is an important parent-of- 
origin effect affecting offspring fitness, genomic imprinting, another 
source of parent-of-origin effect can also lead to phenotypic 
variation. Increasing evidence from cytoplasmic substation and cell 
fusion experiments suggests that weakness of hybrids may connect 
with the interactions between cytoplasm and nuclear [40,41], and 
the evidences about interaction between cytoplasm and imprinting 
have also been observed [42,43]. As the source of genetic variation 
for many traits, these two types of parent-of-origin effects are often 
confounded, making it difficult to distinguish without proper 
statistical dissection. Although the role of cross-talk between the 
two sets of factors on phenotypic variation has been recognized, 
which genes are involved in the process and in what form they 
respond to the cytoplasmic changes are still unclear. 

In this paper, we developed a statistical model to evaluate the 
cytoplasmic environment and nuclear gene interaction subject to 
imprinting effect within the framework of QTL interval mapping. 
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Figure 3. The LOD profiles of the four traits across the 19 chromosomes using the iinicage map constructed from microsatellite 
mariners [38]. The genomic positions corresponding to the peal< of the curve are the MLEs of the QTL locations. 
doi:1 0.1 371 /journal.pone.0091 702.g003 





Figure 4. The local LOD profiles of the four traits across chromosomes 3. 

doi:1 0.1 371/journal.pone.0091 702.g004 
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Table 4. List of QTL positions, corresponding marker intervals and p-values under different tests for the four traits. 



Trait Ch 


QTL postlon 


Marker interval 


p-value^ 


p-value^ 


p-value^ 


p-value* 


CTS 3 


60.6 cM 


D3Mit78-D3Mit348 


0.6680 


0.3046 


0.5005 


<10-i" 


(D3Mit78+5.6 cM) 


SBWR 3 


60.6 cM 


D3Mit78-D3Mit348 


0.4322 


0.6107 


0.4502 


<io-"** 


{D3Mit78+5.6 cM) 


3 


32.6 cM 


D3Mit203-D3Mit212 


0.2003 


0.9502 


0.2700 


<io-"* 


(D3Mit203+11.4 cM) 


1 


63.9 cM 


D3Mitl56-DlMitl7 


0.0282* 


0.0296* 


0.0073* 


0.8064 


(DIMItl 56+31.1 cM) 


MBWR 3 


38.6 cM 


D3Mit203-D3Mit212 


0.3574 


0.2651 


0.1658 


<10-'' 


(D3Mit203+27.4 cM) 


3 


60.6 cM 


D3Mit78-D3Mit348 


0.0102* 


0.9708 


0.4831 


<io-** 


(D3Mit78+5.6 cM) 


3 


52.6 cM 


D3Mitl89-D3Mit78 


0.2120 


0.6421 


0.7095 


<10-*' 


(D3Mitl 89+2.9 cM) 


SIgA 3 


52.6 cM 


D3Mitl89-D3Mit78 


0.1016 


0.6488 


0.3195 


<10-'' 


(D3Mitl 89+2.9 cM) 


3 


38.6 cM 


D3Mit203-D3Mit212 


0.7174 


0.4051 


0.8020 


<io-'* 


(D3Mit203+27.4 cM) 


p-value*^ {k = 1,2,3,4) refer to the p-values obtained with the likelihood ratio tests for testing 1) //o ; c = 0 vs H\ : Cy^O; 2} Hq : / = 0 vs Hi : /#0; 3) Hq 
vs //] : not all equal to 0; and 4) Hq : a = ii = 0 vs Hi : not all equal to 0, respectively. Significant test results are indicated w/ith the "*" sign. 
doi:l 0.1 371 /journal.pone.0091 702.t004 





The model that considers eight genetic factors which measure the 
degree of imprinting, cytoplasmic, additive, dominance effects as 
well as the interaction effects among them, provides a complete 
dissection of cyto-nuclear epistasis subject to imprinting effect. A 
number of hypothesis tests can be performed not only to assess 
major genetic eflFect(s) responsible for phenotypic variation, but also 
to find the statistical evidence for the existence of imprinting, 
cytoplasmic effect as well as the cyto-nuclear interactions. Simula- 
tion study showed relative good performance of the model under the 
F2 design, in which parameters are estimated efiiciendy with modest 
heritabUity and sample size. Low heritabUity level {H^=QA) and 
small sample size (m = 400) result in large mean squared error of 
parameter estimation. This result is valuable in practice as we need 
to be careful about the interpretation of genetic effects obtained in 
real data analysis when the proportion of variance explained by the 
QTL is small (i.e., low heritabUity). Aldiough our model cannot 
estimate the imprinting effect (so the cyto-imprinting interaction 
effect) due to the nature of the F2 design, existence test of imprinting 
(or cyto-imprinting interaction effect) can be achieved. Nevertheless, 
such imprinting estimation problem can be solved under the 
reciprocal backcross design illustrated in Figure 1 . 

In the real data analysis, one QTL located on chromosome 1 
were found to have significant cytoplasmic, imprinting effect and 
cyto-nuclear interaction effects for spleen/body weight ratio. 
Other than that, no imprinting effect was found for all other 
QTLs, and only one on chromosome 3 that shows cytoplasmic 
effect for the MLN/body weigh ratio trait. It is worth mentioning 
that several QTL on chromosome 3 are detected by our model, 
but they are located relatively close to each other, as shown in 
Figure 3. For the sake of cautiousness, we reported all of them. 
However, these detected clustered QTLs may be due to the 
limitation of interval mapping, which can be overcome by fitting a 



composite interval mapping model as following, 
yj = H + cxy + ax2j + dxy + LY4/ + icaXij + icdXij + iciX^j 

K 

+ ^ (akXyk + dkX^jk + ikXAjk + icaj^ xsjk + icdj^ Xb/k + ici^. xyk) + ej 
k=\ 

where Xyk, ■ ■ ■ ,Xyk are corresponding variables for the kth 
marker, assuming total K markers are selected for controlling 
background genetic effect. Although more variables are intro- 
duced in the model, theoretically some dimension-reduction 
techniques such as LASSO, can be applied to implement the 
variable selection for each marker before fitting them into the final 
model [44]. The composite interval mapping is know for its 
improved resolution in QTL detection. Regardless of the potential 
limitations mentioned above, the integration of cyto-nuclear 
interactions into the QTL mapping framework provides a testable 
platform with feasible experimental design for biologists to test the 
existence of cytoplasmic and imprinting effects, as well as the 
interactions of interested. The proposed model will have important 
biological implications with potentials to lift a corner of the great 
veil of the genetic system. 

Supporting Information 

Appendix SI Detailed derivation of the EM algorithm. 

(PDF) 
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